Optimization in the HIP environment must be treated as a rigorous empirical discipline rather than a series of intuitive guesses. By adopting a systematic workflow, developers ensure that every code modification is justified by data, moving performance engineering away from "optimization superstition" toward a repeatable, scientific cycle of hypothesis and verification.
The 6-Step Workflow
HIP performance guidelines recommend a systematic sequence:
- Measure a baseline: Determine current execution time and throughput.
- Profile the program: Use rocprofv3 to collect hardware counters.
- Identify the bottleneck: Determine if you are compute-bound, memory-bound, or latency-bound.
- Apply targeted optimizations: Focus only on the identified bottleneck.
- Re-measure: Verify if the change actually improved performance.
- Iterate: Repeat the process until goals are met.
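The "identify the bottleneck" step can be sketched as a small roofline-style check. This is a host-only illustration, not part of any HIP or rocprofv3 API: the `KernelSample` and `DeviceLimits` structs, the `classify` function, and the 0.5 utilization floor are all assumptions made for this example. Real inputs would come from your own timings, profiler counters, and the device datasheet.

```cpp
// Hypothetical summary of one kernel run. The fields would be filled in from
// timing and profiler output; nothing here is read from the GPU directly.
struct KernelSample {
    double flops;        // floating-point operations executed
    double bytes_moved;  // bytes read from and written to device memory
    double seconds;      // measured kernel time
};

// Hypothetical device limits; take real values from the hardware datasheet.
struct DeviceLimits {
    double peak_flops_per_s;
    double peak_bytes_per_s;
};

enum class Bottleneck { ComputeBound, MemoryBound, LatencyBound };

// Roofline-style heuristic: compare achieved throughput with each peak.
// If neither resource reaches `utilization_floor` (an assumed threshold),
// the kernel is treated as latency-bound; otherwise the more heavily used
// resource is reported as the bottleneck.
Bottleneck classify(const KernelSample& k, const DeviceLimits& d,
                    double utilization_floor = 0.5) {
    const double compute_util = (k.flops / k.seconds) / d.peak_flops_per_s;
    const double memory_util = (k.bytes_moved / k.seconds) / d.peak_bytes_per_s;
    if (compute_util < utilization_floor && memory_util < utilization_floor) {
        return Bottleneck::LatencyBound;
    }
    return compute_util >= memory_util ? Bottleneck::ComputeBound
                                       : Bottleneck::MemoryBound;
}
```

A kernel that moves 8 GB in 10 ms on a device with an assumed 1 TB/s peak sits at 80% of memory bandwidth, so this sketch would report it as memory-bound and the next optimization should target memory traffic rather than arithmetic.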
Avoid Optimization Superstitions
Performance gains should be reproducible results from specific hardware interactions. Avoid these anti-patterns:
- Changing kernel code before measuring current performance.
- Tuning block size without knowing if the kernel is memory-bound.
- Chasing occupancy numbers with no proof they matter to the specific workload.
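One way to guard against these anti-patterns is to accept a change only when repeated timings show a gain larger than run-to-run noise. The sketch below is a minimal host-side illustration. The helper names `median` and `is_real_improvement` and the 5% minimum-gain threshold are assumptions for this example, and the timings would come from your own baseline and re-measurement runs.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty set of timings. Takes the vector by value because
// std::nth_element reorders its input.
double median(std::vector<double> v) {
    const std::size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double upper = v[mid];
    if (v.size() % 2 == 1) {
        return upper;
    }
    // Even count: average the two middle values.
    const double lower = *std::max_element(v.begin(), v.begin() + mid);
    return (lower + upper) / 2.0;
}

// Returns true only when the candidate's median time beats the baseline's
// median by more than `min_gain` (a fraction; 0.05 is an assumed default).
bool is_real_improvement(const std::vector<double>& baseline_ms,
                         const std::vector<double>& candidate_ms,
                         double min_gain = 0.05) {
    const double base = median(baseline_ms);
    const double cand = median(candidate_ms);
    return cand < base * (1.0 - min_gain);
}
```

Using the median rather than a single run keeps one noisy measurement from passing as a speedup. A block-size change that moves the median from 10.0 ms to 9.9 ms would be rejected under the assumed 5% threshold.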